# High-Precision Description

LongVA 7B TPO
MIT
LongVA-7B-TPO is a video-text model derived from LongVA-7B through temporal preference optimization. It excels at long-video understanding tasks.
Video-to-Text, Transformers
ruili0
225
1
CogFlorence 2.2 Large
MIT
This model is a fine-tuned version of microsoft/Florence-2-large, trained on a 40,000-image subset of the Ejafa/ye-pop dataset with annotation texts generated by THUDM/cogvlm2-llama3-chat-19B. It is suitable for image-to-text tasks.
Image-to-Text, Transformers, Supports Multiple Languages
thwri
20.64k
33
Git Base Next Refined
MIT
A fine-tuned image-to-text model based on microsoft/git-base.
Large Language Model, Transformers, Other
swaroopajit
24
0
© 2025 AIbase